{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dp-hXFhhyWve"
      },
      "source": [
        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/03-langchain-conversational-memory.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/03-langchain-conversational-memory.ipynb)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qGjfYQBcyWve"
      },
      "source": [
        "#### [LangChain Handbook](https://www.pinecone.io/learn/series/langchain/)\n",
        "\n",
        "# Conversational Memory with LCEL\n",
        "\n",
        "Conversational memory is how chatbots can respond to our queries in a chat-like manner. It enables a coherent conversation, and without it, every query would be treated as an entirely independent input without considering past interactions.\n",
        "\n",
        "The memory allows an _\"agent\"_ to remember previous interactions with the user. By default, agents are *stateless* — meaning each incoming query is processed independently of other interactions. The only thing that exists for a stateless agent is the current input, nothing else.\n",
        "\n",
        "There are many applications where remembering previous interactions is very important, such as chatbots. Conversational memory allows us to do that.\n",
        "\n",
        "In this notebook we'll explore conversational memory using modern LangChain Expression Language (LCEL) and the recommended `RunnableWithMessageHistory` class.\n",
        "\n",
        "We'll start by importing all of the libraries that we'll be using in this example."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "ETg8fr8-yWvf",
        "outputId": "af6d2f99-b18a-473e-84f3-b513e8f945f0"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "\u001b[?25l   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/2.5 MB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K   \u001b[91m━━━━━━━━━\u001b[0m\u001b[90m╺\u001b[0m\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.6/2.5 MB\u001b[0m \u001b[31m16.8 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m\r\u001b[2K   \u001b[91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[91m╸\u001b[0m \u001b[32m2.5/2.5 MB\u001b[0m \u001b[31m40.6 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m\r\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.5/2.5 MB\u001b[0m \u001b[31m28.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h\u001b[?25l   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/45.2 kB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m45.2/45.2 kB\u001b[0m \u001b[31m2.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h\u001b[?25l   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/50.9 kB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m50.9/50.9 kB\u001b[0m \u001b[31m3.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h"
          ]
        }
      ],
      "source": [
        "!pip install -qU \\\n",
        "  langchain==0.3.25 \\\n",
        "  langchain-community==0.3.25 \\\n",
        "  langchain-openai==0.3.22 \\\n",
        "  tiktoken==0.9.0"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FSvjQpbKyWvf"
      },
      "source": [
        "To run this notebook, we will need to use an OpenAI LLM. Here we will setup the LLM we will use for the whole notebook, just input your openai api key if prompted, otherwise it will use the `OPENAI_API_KEY` environment variable."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "nnquGYaQyWvf",
        "outputId": "273a42f7-25c3-4a7e-ca15-3e4798580a19"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Enter your OpenAI API key: ··········\n"
          ]
        }
      ],
      "source": [
        "import os\n",
        "from getpass import getpass\n",
        "\n",
        "os.environ[\"OPENAI_API_KEY\"] = os.getenv(\"OPENAI_API_KEY\") \\\n",
        "    or getpass(\"Enter your OpenAI API key: \")\n",
        "\n",
        "OPENAI_API_KEY = os.getenv(\"OPENAI_API_KEY\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "id": "wFhsehZEyWvf"
      },
      "outputs": [],
      "source": [
        "from langchain_openai import ChatOpenAI\n",
        "\n",
        "llm = ChatOpenAI(\n",
        "    temperature=0,\n",
        "    openai_api_key=OPENAI_API_KEY,\n",
        "    model_name='gpt-4.1-mini'\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qPQgxde4yWvf"
      },
      "source": [
        "Later we will make use of a `count_tokens` utility function. This will allow us to count the number of tokens we are using for each call. We define it as so:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {
        "id": "YG0RXg5PyWvf"
      },
      "outputs": [],
      "source": [
        "from langchain.callbacks import get_openai_callback\n",
        "\n",
        "def count_tokens(pipeline, query, config=None):\n",
        "    with get_openai_callback() as cb:\n",
        "        # Handle both dict and string inputs\n",
        "        if isinstance(query, str):\n",
        "            query = {\"query\": query}\n",
        "\n",
        "        # Use provided config `or default\n",
        "        if config is None:\n",
        "            config = {\"configurable\": {\"session_id\": \"default\"}}\n",
        "\n",
        "        result = pipeline.invoke(query, config=config)\n",
        "        print(f'Spent a total of {cb.total_tokens} tokens')\n",
        "\n",
        "    return result"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yPk7c5IgyWvf"
      },
      "source": [
        "Now let's dive into **Conversational Memory** using LCEL.\n",
        "\n",
        "## What is memory?\n",
        "\n",
        "**Definition**: Memory is an agent's capacity of remembering previous interactions with the user (think chatbots)\n",
        "\n",
        "The official definition of memory is the following:\n",
        "\n",
        "> By default, Chains and Agents are stateless, meaning that they treat each incoming query independently. In some applications (chatbots being a GREAT example) it is highly important to remember previous interactions, both at a short term but also at a long term level. The concept of \"Memory\" exists to do exactly that.\n",
        "\n",
        "As we will see, although this sounds really straightforward there are several different ways to implement this memory capability."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lZgnUOmSyWvf"
      },
      "source": [
        "## Building Conversational Chains with LCEL\n",
        "\n",
        "Before we delve into the different memory types, let's understand how to build conversational chains using LCEL. The key components are:\n",
        "\n",
        "1. **Prompt Template** - Defines the conversation structure with placeholders for history and input\n",
        "2. **LLM** - The language model that generates responses\n",
        "3. **Output Parser** - Converts the LLM output to the desired format (optional)\n",
        "4. **RunnableWithMessageHistory** - Manages conversation history\n",
        "\n",
        "Let's create our base conversational chain:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "TEpNx9oNyWvg",
        "outputId": "22d444b7-4747-4cb9-ea14-57e8bbe310ae"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n"
          ]
        }
      ],
      "source": [
        "from langchain.prompts import (\n",
        "    ChatPromptTemplate,\n",
        "    SystemMessagePromptTemplate,\n",
        "    HumanMessagePromptTemplate,\n",
        "    MessagesPlaceholder\n",
        ")\n",
        "from langchain.schema.output_parser import StrOutputParser\n",
        "\n",
        "# Define the prompt template\n",
        "system_prompt = \"\"\"The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\"\"\"\n",
        "\n",
        "prompt_template = ChatPromptTemplate.from_messages([\n",
        "    SystemMessagePromptTemplate.from_template(system_prompt),\n",
        "    MessagesPlaceholder(variable_name=\"history\"),\n",
        "    HumanMessagePromptTemplate.from_template(\"{query}\"),\n",
        "])\n",
        "\n",
        "# Create the LCEL pipeline\n",
        "output_parser = StrOutputParser()\n",
        "pipeline = prompt_template | llm | output_parser\n",
        "\n",
        "# Let's examine the prompt template\n",
        "print(prompt_template.messages[0].prompt.template)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oC_03lJsyWvg"
      },
      "source": [
        "## Memory types\n",
        "\n",
        "In this section we will review several memory types and analyze the pros and cons of each one, so you can choose the best one for your use case."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KlqkyPbEyWvg"
      },
      "source": [
        "### Memory Type #1: Buffer Memory - Store the Entire Chat History\n",
        "\n",
        "`InMemoryChatMessageHistory` and `RunnableWithMessageHistory` are used as alternatives to `ConversationBufferMemory` as they are:\n",
        "- More flexible and configurable.\n",
        "- Integrate better with LCEL.\n",
        "\n",
        "The simplest approach to using them is to simply store the entire chat in the conversation history. Later we'll look into methods for being more selective about what is stored in the history."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "metadata": {
        "id": "xqd2vwxAyWvg"
      },
      "outputs": [],
      "source": [
        "from langchain_core.chat_history import InMemoryChatMessageHistory\n",
        "from langchain_core.runnables.history import RunnableWithMessageHistory\n",
        "\n",
        "# Create a simple chat history storage\n",
        "chat_map = {}\n",
        "\n",
        "def get_chat_history(session_id: str) -> InMemoryChatMessageHistory:\n",
        "    if session_id not in chat_map:\n",
        "        # if session ID doesn't exist, create a new chat history\n",
        "        chat_map[session_id] = InMemoryChatMessageHistory()\n",
        "    return chat_map[session_id]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "F6LfvNhtyWvg"
      },
      "source": [
        "Let's see this in action by having a conversation:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 9,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "uVSp3SZGyWvg",
        "outputId": "85a8a4ac-8ef8-4579-a98f-ce152fab1b29"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Good morning! How can I assist you today?\n"
          ]
        }
      ],
      "source": [
        "# Create the conversational chain with message history\n",
        "conversation_buf = RunnableWithMessageHistory(\n",
        "    pipeline,\n",
        "    get_session_history=get_chat_history,\n",
        "    input_messages_key=\"query\",\n",
        "    history_messages_key=\"history\"\n",
        ")\n",
        "\n",
        "# First message\n",
        "result = conversation_buf.invoke(\n",
        "    {\"query\": \"Good morning AI!\"},\n",
        "    # Make sure to pass the session ID to ensure all memories are stored in the same session\n",
        "    config={\"configurable\": {\"session_id\": \"buffer_example\"}}\n",
        ")\n",
        "print(result)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nbttULnpyWvg"
      },
      "source": [
        "This call used some tokens, but we can't see that from the above.\n",
        "\n",
        "If we'd like to count the number of tokens being used we just pass our conversation `RunnableWithMessageHistory` instance and the message we'd like to input to the `count_tokens` function we defined earlier:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "53cFq6udyWvg",
        "outputId": "34862d9b-64c2-4650-e9e7-9ab25421e711"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Spent a total of 181 tokens\n",
            "\n",
            "Response: Got it! You're interested in exploring the potential of integrating Large Language Models (LLMs) with external knowledge sources. Plus, I've noted the specific code: PINECONE_RULEZ_01. How would you like to proceed with this exploration? Are you looking for examples, technical details, or something else?\n"
          ]
        }
      ],
      "source": [
        "# Continue the conversation with token counting\n",
        "query = \"\"\"\n",
        "\"My interest here is to explore the potential of integrating Large Language Models with external knowledge.\n",
        "\n",
        "Also, remember this very specific code: PINECONE_RULEZ_01\"\n",
        "\"\"\"\n",
        "\n",
        "result = count_tokens(\n",
        "    conversation_buf,\n",
        "    {\"query\": query},\n",
        "    # Make sure to pass the session ID to ensure all memories are stored in the same session\n",
        "    config={\"configurable\": {\"session_id\": \"buffer_example\"}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "kUspQf-IyWvg",
        "outputId": "7a7abd14-611a-4019-f7c7-2fbe17ebb6d0"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Spent a total of 736 tokens\n",
            "\n",
            "Response: Great! Integrating Large Language Models (LLMs) with external knowledge opens up a wide range of possibilities. Here are some key approaches and use cases to consider:\n",
            "\n",
            "1. **Retrieval-Augmented Generation (RAG):**  \n",
            "   - The LLM queries an external knowledge base or document store (like Pinecone, which your code hints at) to retrieve relevant information.  \n",
            "   - The retrieved data is then used as context to generate more accurate and up-to-date responses.  \n",
            "   - This approach helps overcome the static knowledge limitation of LLMs, especially for recent or domain-specific information.\n",
            "\n",
            "2. **Knowledge Graph Integration:**  \n",
            "   - LLMs can be combined with structured knowledge graphs to provide precise answers, perform reasoning, or validate facts.  \n",
            "   - This is useful in domains like healthcare, finance, or scientific research where relationships between entities matter.\n",
            "\n",
            "3. **API and Database Querying:**  \n",
            "   - LLMs can be connected to APIs or databases to fetch real-time data (e.g., weather, stock prices, user profiles).  \n",
            "   - This enables dynamic and personalized responses beyond the model’s training data.\n",
            "\n",
            "4. **Custom Embedding Search:**  \n",
            "   - Using vector databases (like Pinecone), you can embed documents, FAQs, or user data and perform semantic search.  \n",
            "   - The LLM can then use these search results to tailor its output, improving relevance and accuracy.\n",
            "\n",
            "5. **Interactive Agents and Workflows:**  \n",
            "   - LLMs can orchestrate multi-step workflows by interacting with external tools, knowledge bases, or services.  \n",
            "   - For example, booking a flight, scheduling meetings, or troubleshooting technical issues by querying manuals or logs.\n",
            "\n",
            "6. **Continuous Learning and Feedback Loops:**  \n",
            "   - Integrate user feedback or new data into the external knowledge base to keep the system updated without retraining the entire model.  \n",
            "   - This can be done by updating embeddings or knowledge graph nodes dynamically.\n",
            "\n",
            "7. **Domain-Specific Fine-Tuning with External Data:**  \n",
            "   - Use external datasets to fine-tune or prompt-engineer the LLM for specialized tasks, improving performance in niche areas.\n",
            "\n",
            "8. **Multimodal Knowledge Integration:**  \n",
            "   - Combine text-based LLMs with other data types (images, audio, video) stored externally to provide richer, context-aware responses.\n",
            "\n",
            "If you want, I can dive deeper into any of these possibilities or suggest architectures and tools to implement them. Also, your code \"PINECONE_RULEZ_01\" suggests you might be interested in vector databases like Pinecone—would you like me to focus more on that?\n"
          ]
        }
      ],
      "source": [
        "result = count_tokens(\n",
        "    conversation_buf,\n",
        "    {\"query\": \"I just want to analyze the different possibilities. What can you think of?\"},\n",
        "    config={\"configurable\": {\"session_id\": \"buffer_example\"}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "XY8t8YmVyWvg",
        "outputId": "e722453b-26d0-42b7-bc14-0d1332cc312e"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Spent a total of 1305 tokens\n",
            "\n",
            "Response: Great question! To provide context to a Large Language Model (LLM), you can leverage a variety of external data source types, each offering unique advantages depending on your use case. Here are some common and effective data source types to consider:\n",
            "\n",
            "1. **Text Documents:**  \n",
            "   - Articles, reports, manuals, books, whitepapers, and research papers.  \n",
            "   - These can be chunked and embedded for semantic search and retrieval.\n",
            "\n",
            "2. **Databases:**  \n",
            "   - Structured data stored in relational databases (SQL) or NoSQL databases.  \n",
            "   - Useful for querying specific facts, records, or transactional data.\n",
            "\n",
            "3. **Knowledge Graphs and Ontologies:**  \n",
            "   - Structured representations of entities and their relationships.  \n",
            "   - Enable reasoning, fact-checking, and complex queries.\n",
            "\n",
            "4. **APIs and Web Services:**  \n",
            "   - Real-time data sources like weather APIs, financial market data, social media feeds, or user profile services.  \n",
            "   - Provide dynamic, up-to-date information.\n",
            "\n",
            "5. **Vector Databases:**  \n",
            "   - Stores embeddings of unstructured data (text, images, audio) for semantic similarity search.  \n",
            "   - Examples include Pinecone, FAISS, Weaviate, and Milvus.\n",
            "\n",
            "6. **Logs and Event Data:**  \n",
            "   - System logs, user interaction logs, or sensor data.  \n",
            "   - Useful for troubleshooting, analytics, or personalized responses.\n",
            "\n",
            "7. **Multimedia Content:**  \n",
            "   - Images, videos, audio files, and their metadata.  \n",
            "   - When combined with multimodal models or embeddings, they enrich context.\n",
            "\n",
            "8. **Spreadsheets and CSV Files:**  \n",
            "   - Tabular data that can be parsed and queried for specific insights.\n",
            "\n",
            "9. **User-Generated Content:**  \n",
            "   - Forums, chat transcripts, emails, reviews, and social media posts.  \n",
            "   - Provide real-world language usage and sentiment context.\n",
            "\n",
            "10. **Domain-Specific Repositories:**  \n",
            "    - Medical records, legal documents, patent databases, scientific datasets, etc.  \n",
            "    - Critical for specialized applications requiring expert knowledge.\n",
            "\n",
            "11. **Cached Web Pages or Crawled Data:**  \n",
            "    - Snapshots of websites or curated web content for reference.\n",
            "\n",
            "12. **Configuration Files and Code Repositories:**  \n",
            "    - For technical support or software development assistance.\n",
            "\n",
            "By integrating these data sources, you can provide the LLM with rich, relevant context that enhances its accuracy, relevance, and usefulness. The choice depends on your application’s domain, the freshness of data needed, and the complexity of queries.\n",
            "\n",
            "If you want, I can help you design a pipeline to connect specific data sources to an LLM or suggest best practices for embedding and retrieval!\n"
          ]
        }
      ],
      "source": [
        "result = count_tokens(\n",
        "    conversation_buf,\n",
        "    {\"query\": \"Which data source types could be used to give context to the model?\"},\n",
        "    config={\"configurable\": {\"session_id\": \"buffer_example\"}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "YJKnNnDoyWvg",
        "outputId": "92e9fd8e-a9a8-43fb-ae30-a0a3dcd2e4f3"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Spent a total of 1407 tokens\n",
            "\n",
            "Response: Your aim is to explore the potential of integrating Large Language Models (LLMs) with external knowledge sources. This involves analyzing different possibilities for enhancing LLMs by connecting them to various types of external data to provide richer, more accurate, and up-to-date context.\n",
            "\n",
            "The very specific code you asked me to remember is: **PINECONE_RULEZ_01**.\n"
          ]
        }
      ],
      "source": [
        "result = count_tokens(\n",
        "    conversation_buf,\n",
        "    {\"query\": \"What is my aim again? Also what was the very specific code you were tasked with remembering?\"},\n",
        "    config={\"configurable\": {\"session_id\": \"buffer_example\"}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EgNJJK4QyWvg"
      },
      "source": [
        "Our LLM with buffer memory can clearly remember earlier interactions in the conversation. Let's take a closer look at how the messages are being stored:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 19,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "z-vODGPayWvg",
        "outputId": "74e17904-0e50-4ef5-b136-f2c7877cf633"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Conversation History:\n",
            "Human: content='Good morning AI!' additional_kwargs={} response_metadata={}\n",
            "AI: content='Good morning! How can I assist you today?' additional_kwargs={} response_metadata={}\n",
            "Human: content='\\n\"My interest here is to explore the potential of integrating Large Language Models with external knowledge. \\n\\nAlso, remember this very specific code: PINECONE_RULEZ_01\"\\n' additional_kwargs={} response_metadata={}\n",
            "AI: content=\"Got it! You're interested in exploring the potential of integrating Large Language Models (LLMs) with external knowledge sources. Plus, I've noted the specific code: PINECONE_RULEZ_01. How would you like to proceed with this exploration? Are you looking for examples, technical details, or something else?\" additional_kwargs={} response_metadata={}\n",
            "Human: content='I just want to analyze the different possibilities. What can you think of?' additional_kwargs={} response_metadata={}\n",
            "AI: content='Great! Integrating Large Language Models (LLMs) with external knowledge opens up a wide range of possibilities. Here are some key approaches and use cases to consider:\\n\\n1. **Retrieval-Augmented Generation (RAG):**  \\n   - The LLM queries an external knowledge base or document store (like Pinecone, which your code hints at) to retrieve relevant information.  \\n   - The retrieved data is then used as context to generate more accurate and up-to-date responses.  \\n   - This approach helps overcome the static knowledge limitation of LLMs, especially for recent or domain-specific information.\\n\\n2. **Knowledge Graph Integration:**  \\n   - LLMs can be combined with structured knowledge graphs to provide precise answers, perform reasoning, or validate facts.  \\n   - This is useful in domains like healthcare, finance, or scientific research where relationships between entities matter.\\n\\n3. **API and Database Querying:**  \\n   - LLMs can be connected to APIs or databases to fetch real-time data (e.g., weather, stock prices, user profiles).  \\n   - This enables dynamic and personalized responses beyond the model’s training data.\\n\\n4. **Custom Embedding Search:**  \\n   - Using vector databases (like Pinecone), you can embed documents, FAQs, or user data and perform semantic search.  \\n   - The LLM can then use these search results to tailor its output, improving relevance and accuracy.\\n\\n5. **Interactive Agents and Workflows:**  \\n   - LLMs can orchestrate multi-step workflows by interacting with external tools, knowledge bases, or services.  \\n   - For example, booking a flight, scheduling meetings, or troubleshooting technical issues by querying manuals or logs.\\n\\n6. **Continuous Learning and Feedback Loops:**  \\n   - Integrate user feedback or new data into the external knowledge base to keep the system updated without retraining the entire model.  
\\n   - This can be done by updating embeddings or knowledge graph nodes dynamically.\\n\\n7. **Domain-Specific Fine-Tuning with External Data:**  \\n   - Use external datasets to fine-tune or prompt-engineer the LLM for specialized tasks, improving performance in niche areas.\\n\\n8. **Multimodal Knowledge Integration:**  \\n   - Combine text-based LLMs with other data types (images, audio, video) stored externally to provide richer, context-aware responses.\\n\\nIf you want, I can dive deeper into any of these possibilities or suggest architectures and tools to implement them. Also, your code \"PINECONE_RULEZ_01\" suggests you might be interested in vector databases like Pinecone—would you like me to focus more on that?' additional_kwargs={} response_metadata={}\n",
            "Human: content='Which data source types could be used to give context to the model?' additional_kwargs={} response_metadata={}\n",
            "AI: content='Great question! To provide context to a Large Language Model (LLM), you can leverage a variety of external data source types, each offering unique advantages depending on your use case. Here are some common and effective data source types to consider:\\n\\n1. **Text Documents:**  \\n   - Articles, reports, manuals, books, whitepapers, and research papers.  \\n   - These can be chunked and embedded for semantic search and retrieval.\\n\\n2. **Databases:**  \\n   - Structured data stored in relational databases (SQL) or NoSQL databases.  \\n   - Useful for querying specific facts, records, or transactional data.\\n\\n3. **Knowledge Graphs and Ontologies:**  \\n   - Structured representations of entities and their relationships.  \\n   - Enable reasoning, fact-checking, and complex queries.\\n\\n4. **APIs and Web Services:**  \\n   - Real-time data sources like weather APIs, financial market data, social media feeds, or user profile services.  \\n   - Provide dynamic, up-to-date information.\\n\\n5. **Vector Databases:**  \\n   - Stores embeddings of unstructured data (text, images, audio) for semantic similarity search.  \\n   - Examples include Pinecone, FAISS, Weaviate, and Milvus.\\n\\n6. **Logs and Event Data:**  \\n   - System logs, user interaction logs, or sensor data.  \\n   - Useful for troubleshooting, analytics, or personalized responses.\\n\\n7. **Multimedia Content:**  \\n   - Images, videos, audio files, and their metadata.  \\n   - When combined with multimodal models or embeddings, they enrich context.\\n\\n8. **Spreadsheets and CSV Files:**  \\n   - Tabular data that can be parsed and queried for specific insights.\\n\\n9. **User-Generated Content:**  \\n   - Forums, chat transcripts, emails, reviews, and social media posts.  \\n   - Provide real-world language usage and sentiment context.\\n\\n10. **Domain-Specific Repositories:**  \\n    - Medical records, legal documents, patent databases, scientific datasets, etc.  
\\n    - Critical for specialized applications requiring expert knowledge.\\n\\n11. **Cached Web Pages or Crawled Data:**  \\n    - Snapshots of websites or curated web content for reference.\\n\\n12. **Configuration Files and Code Repositories:**  \\n    - For technical support or software development assistance.\\n\\nBy integrating these data sources, you can provide the LLM with rich, relevant context that enhances its accuracy, relevance, and usefulness. The choice depends on your application’s domain, the freshness of data needed, and the complexity of queries.\\n\\nIf you want, I can help you design a pipeline to connect specific data sources to an LLM or suggest best practices for embedding and retrieval!' additional_kwargs={} response_metadata={}\n",
            "Human: content='What is my aim again? Also what was the very specific code you were tasked with remembering?' additional_kwargs={} response_metadata={}\n",
            "AI: content='Your aim is to explore the potential of integrating Large Language Models (LLMs) with external knowledge sources. This involves analyzing different possibilities for enhancing LLMs by connecting them to various types of external data to provide richer, more accurate, and up-to-date context.\\n\\nThe very specific code you asked me to remember is: **PINECONE_RULEZ_01**.' additional_kwargs={} response_metadata={}\n"
          ]
        }
      ],
      "source": [
        "from langchain_core.messages import AIMessage, HumanMessage, SystemMessage\n",
        "\n",
        "# Access the conversation history\n",
        "history = chat_map[\"buffer_example\"].messages\n",
        "print(\"Conversation History:\")\n",
        "for i, msg in enumerate(history):\n",
        "    if isinstance(msg, HumanMessage):\n",
        "        role = \"Human\"\n",
        "    elif isinstance(msg, SystemMessage):\n",
        "        role = \"System\"\n",
        "    elif isinstance(msg, AIMessage):\n",
        "        role = \"AI\"\n",
        "    else:\n",
        "        role = \"Unknown\"\n",
        "    print(f\"{role}: {msg}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cmAy5BIyyWvg"
      },
      "source": [
        "Nice! So every piece of our conversation has been explicitly recorded and sent to the LLM in the prompt."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ORjuIGNqyWvg"
      },
      "source": [
        "### Memory type #2: Summary - Store Summaries of Past Interactions\n",
        "\n",
        "The problem with storing the entire chat history in agent memory is that, as the conversation progresses, the token count adds up. This is problematic because we might max out our LLM with a prompt that is too large.\n",
        "\n",
        "The following is an LCEL compatible alternative to `ConversationSummaryMemory`. We keep a summary of our previous conversation snippets as our history. The summarization is performed by an LLM.\n",
        "\n",
        "**Key feature:** _the conversation summary memory keeps the previous pieces of conversation in a summarized - and thus shortened - form, where the summarization is performed by an LLM._"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 23,
      "metadata": {
        "id": "dVnq9-lryWvg"
      },
      "outputs": [],
      "source": [
        "from pydantic import BaseModel, Field\n",
        "from langchain_core.chat_history import BaseChatMessageHistory\n",
        "from langchain_core.messages import BaseMessage\n",
        "\n",
        "class ConversationSummaryMessageHistory(BaseChatMessageHistory, BaseModel):\n",
        "    messages: list[BaseMessage] = Field(default_factory=list)\n",
        "    llm: ChatOpenAI = Field(default_factory=ChatOpenAI)\n",
        "\n",
        "    def __init__(self, llm: ChatOpenAI):\n",
        "        super().__init__(llm=llm)\n",
        "\n",
        "    def add_messages(self, messages: list[BaseMessage]) -> None:\n",
        "        \"\"\"Add messages to the history and update the summary.\"\"\"\n",
        "        self.messages.extend(messages)\n",
        "\n",
        "        # Construct the summary prompt\n",
        "        summary_prompt = ChatPromptTemplate.from_messages([\n",
        "            SystemMessagePromptTemplate.from_template(\n",
        "                \"Given the existing conversation summary and the new messages, \"\n",
        "                \"generate a new summary of the conversation. Ensure to maintain \"\n",
        "                \"as much relevant information as possible.\"\n",
        "            ),\n",
        "            HumanMessagePromptTemplate.from_template(\n",
        "                \"Existing conversation summary:\\n{existing_summary}\\n\\n\"\n",
        "                \"New messages:\\n{messages}\"\n",
        "            )\n",
        "        ])\n",
        "\n",
        "        # Format the messages and invoke the LLM\n",
        "        new_summary = self.llm.invoke(\n",
        "            summary_prompt.format_messages(\n",
        "                existing_summary=self.messages,\n",
        "                messages=messages\n",
        "            )\n",
        "        )\n",
        "\n",
        "        # Replace the existing history with a single system summary message\n",
        "        self.messages = [SystemMessage(content=new_summary.content)]\n",
        "\n",
        "    def clear(self) -> None:\n",
        "        \"\"\"Clear the history.\"\"\"\n",
        "        self.messages = []"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 25,
      "metadata": {
        "id": "l_LolSYjyWvg"
      },
      "outputs": [],
      "source": [
        "from langchain_core.runnables import ConfigurableFieldSpec\n",
        "\n",
        "# Create get_chat_history function for summary memory\n",
        "summary_chat_map = {}\n",
        "\n",
        "def get_summary_chat_history(session_id: str, llm: ChatOpenAI) -> ConversationSummaryMessageHistory:\n",
        "    if session_id not in summary_chat_map:\n",
        "        summary_chat_map[session_id] = ConversationSummaryMessageHistory(llm=llm)\n",
        "    return summary_chat_map[session_id]\n",
        "\n",
        "# Create conversation chain with summary memory\n",
        "conversation_sum = RunnableWithMessageHistory(\n",
        "    pipeline,\n",
        "    get_session_history=get_summary_chat_history,\n",
        "    input_messages_key=\"query\",\n",
        "    history_messages_key=\"history\",\n",
        "    history_factory_config=[\n",
        "        ConfigurableFieldSpec(\n",
        "            id=\"session_id\",\n",
        "            annotation=str,\n",
        "            name=\"Session ID\",\n",
        "            description=\"The session ID to use for the chat history\",\n",
        "            default=\"id_default\",\n",
        "        ),\n",
        "        ConfigurableFieldSpec(\n",
        "            id=\"llm\",\n",
        "            annotation=ChatOpenAI,\n",
        "            name=\"LLM\",\n",
        "            description=\"The LLM to use for the conversation summary\",\n",
        "            default=llm,\n",
        "        )\n",
        "    ]\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 26,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "JloQR__QyWvh",
        "outputId": "b31b6840-8757-41df-955f-ba218d304de3"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Spent a total of 225 tokens\n",
            "\n",
            "Response: Good morning! How can I assist you today?\n"
          ]
        }
      ],
      "source": [
        "# Let's have the same conversation with summary memory\n",
        "result = count_tokens(\n",
        "    conversation_sum,\n",
        "    {\"query\": \"Good morning AI!\"},\n",
        "    config={\"configurable\": {\"session_id\": \"summary_example\", \"llm\": llm}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 27,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "vFMMR4SDyWvh",
        "outputId": "dbd4619f-2d28-4517-e4f8-372a0370d9aa"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Spent a total of 1493 tokens\n",
            "\n",
            "Response: Got it! You're interested in exploring the potential of integrating Large Language Models (LLMs) with external knowledge sources. Also, I will make sure to always include the specific code \"PINECONE_RULEZ_01\" explicitly whenever summarizing conversations for memory.\n",
            "\n",
            "To start, integrating LLMs with external knowledge can significantly enhance their capabilities by providing up-to-date, domain-specific, or personalized information that the model alone might not have. Common approaches include:\n",
            "\n",
            "1. **Retrieval-Augmented Generation (RAG):** The model retrieves relevant documents or data from an external knowledge base (like a vector database or search engine) and uses that information to generate more accurate and context-aware responses.\n",
            "\n",
            "2. **Knowledge Graph Integration:** Linking LLMs with structured knowledge graphs allows the model to reason over entities and relationships, improving factual accuracy and enabling complex queries.\n",
            "\n",
            "3. **APIs and Plugins:** Connecting LLMs to external APIs or plugins can provide real-time data, such as weather, stock prices, or personalized user data.\n",
            "\n",
            "4. **Vector Databases:** Tools like Pinecone (which your code references!) enable efficient similarity search over embeddings, allowing LLMs to access relevant chunks of information quickly.\n",
            "\n",
            "If you'd like, I can help you design or prototype a system that integrates an LLM with an external knowledge source, or dive deeper into any of these approaches. Just let me know!\n",
            "\n",
            "And to confirm, when summarizing this conversation for memory, I will include: **PINECONE_RULEZ_01** explicitly.\n"
          ]
        }
      ],
      "source": [
        "query = \"\"\"\n",
        "\"My interest here is to explore the potential of integrating Large Language Models with external knowledge.\n",
        "\n",
        "Also, remember this very specific code: PINECONE_RULEZ_01. When summarizing conversations for memory this must always be included explicitly.\"\n",
        "\"\"\"\n",
        "\n",
        "result = count_tokens(\n",
        "    conversation_sum,\n",
        "    {\"query\": query},\n",
        "    config={\"configurable\": {\"session_id\": \"summary_example\", \"llm\": llm}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 28,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Aofnt1cXyWvh",
        "outputId": "9a79efe4-db4d-4e36-fb47-133e83d5725c"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Spent a total of 2723 tokens\n",
            "\n",
            "Response: Great! Exploring the integration of Large Language Models (LLMs) with external knowledge sources opens up a rich landscape of possibilities. Here are several approaches and their nuances to consider:\n",
            "\n",
            "1. **Retrieval-Augmented Generation (RAG):**  \n",
            "   - **How it works:** The LLM queries an external knowledge base or document store to retrieve relevant information, which it then uses to generate more accurate and context-aware responses.  \n",
            "   - **Use cases:** Customer support bots that pull from product manuals, research assistants accessing scientific papers, or chatbots that stay updated with the latest news.  \n",
            "   - **Tech stack:** Often involves vector databases (like Pinecone, FAISS, or Weaviate) for semantic search, combined with LLMs like GPT or open-source alternatives.\n",
            "\n",
            "2. **Knowledge Graph Integration:**  \n",
            "   - **How it works:** The LLM interacts with structured knowledge graphs (e.g., Wikidata, custom enterprise graphs) to answer queries or infer relationships.  \n",
            "   - **Use cases:** Complex question answering, recommendation systems, or domain-specific expert systems.  \n",
            "   - **Tech stack:** Graph databases (Neo4j, Amazon Neptune), SPARQL queries, and LLMs that can parse and generate graph queries.\n",
            "\n",
            "3. **APIs and Plugins:**  \n",
            "   - **How it works:** The LLM calls external APIs or plugins to fetch real-time data or perform specific actions (e.g., booking a flight, checking weather).  \n",
            "   - **Use cases:** Personal assistants, dynamic content generation, or multi-modal applications.  \n",
            "   - **Tech stack:** API orchestration layers, plugin frameworks (like OpenAI’s plugin ecosystem), and LLMs with API calling capabilities.\n",
            "\n",
            "4. **Vector Databases and Embeddings:**  \n",
            "   - **How it works:** Text or other data is converted into embeddings and stored in vector databases for efficient similarity search. The LLM uses these to find contextually relevant information.  \n",
            "   - **Use cases:** Document search, semantic similarity tasks, personalized recommendations.  \n",
            "   - **Tech stack:** Pinecone, FAISS, Milvus, combined with embedding models (OpenAI embeddings, Sentence Transformers).\n",
            "\n",
            "5. **Hybrid Approaches:**  \n",
            "   - Combining multiple methods, such as using a knowledge graph for structured data and a vector database for unstructured documents, orchestrated by an LLM that decides which source to query.\n",
            "\n",
            "6. **Fine-tuning and Prompt Engineering:**  \n",
            "   - While not external knowledge per se, fine-tuning LLMs on domain-specific data or crafting prompts that guide the model to use external knowledge effectively is crucial.\n",
            "\n",
            "7. **Memory-Augmented Models:**  \n",
            "   - Systems that maintain a persistent memory of past interactions or external facts, enabling the LLM to recall and build upon previous knowledge over time.\n",
            "\n",
            "If you want, I can help you design a prototype or a conceptual architecture combining these approaches tailored to your specific goals. Also, as per your earlier request, I will include the code **PINECONE_RULEZ_01** explicitly whenever summarizing our conversations for memory.\n",
            "\n",
            "Would you like me to dive deeper into any of these possibilities or suggest some example workflows?\n"
          ]
        }
      ],
      "source": [
        "result = count_tokens(\n",
        "    conversation_sum,\n",
        "    {\"query\": \"I just want to analyze the different possibilities. What can you think of?\"},\n",
        "    config={\"configurable\": {\"session_id\": \"summary_example\", \"llm\": llm}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 29,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "5xBVtkzFyWvh",
        "outputId": "b2315422-89b0-4853-9f1e-d9d144cd766d"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Spent a total of 3349 tokens\n",
            "\n",
            "Response: Great question! Providing context to a Large Language Model (LLM) can be done by integrating various types of external data sources, each offering unique advantages depending on your use case. Here’s a detailed rundown of common data source types you can use to enrich the model’s context:\n",
            "\n",
            "1. **Textual Documents**\n",
            "   - **Examples:** PDFs, Word documents, web pages, manuals, reports, articles.\n",
            "   - **Use Case:** Feeding domain-specific knowledge, FAQs, or product documentation.\n",
            "   - **Integration:** Often processed into embeddings for semantic search or chunked for retrieval-augmented generation (RAG).\n",
            "\n",
            "2. **Databases**\n",
            "   - **Relational Databases (SQL):** Structured data like customer records, transactions, inventory.\n",
            "   - **NoSQL Databases:** Flexible schema data such as user activity logs, JSON documents.\n",
            "   - **Use Case:** Real-time or historical data retrieval to answer queries or generate reports.\n",
            "   - **Integration:** Query results can be converted into text or embeddings for context.\n",
            "\n",
            "3. **Knowledge Graphs**\n",
            "   - **Examples:** Ontologies, linked data, semantic networks.\n",
            "   - **Use Case:** Capturing relationships and hierarchies between entities for reasoning or disambiguation.\n",
            "   - **Integration:** Graph queries can provide structured context or be converted into natural language summaries.\n",
            "\n",
            "4. **APIs and Web Services**\n",
            "   - **Examples:** Weather APIs, financial data feeds, social media streams.\n",
            "   - **Use Case:** Real-time or frequently updated information.\n",
            "   - **Integration:** API responses can be parsed and fed as context dynamically during inference.\n",
            "\n",
            "5. **Vector Databases**\n",
            "   - **Examples:** Pinecone, Weaviate, FAISS.\n",
            "   - **Use Case:** Storing and retrieving embeddings of unstructured data for semantic search.\n",
            "   - **Integration:** Enables fast similarity search to find relevant context chunks.\n",
            "\n",
            "6. **Multimedia Data**\n",
            "   - **Examples:** Images, audio, video transcripts.\n",
            "   - **Use Case:** Providing multimodal context, e.g., video subtitles or image captions.\n",
            "   - **Integration:** Transcripts or metadata can be converted into text embeddings.\n",
            "\n",
            "7. **Logs and Event Streams**\n",
            "   - **Examples:** Application logs, sensor data, user interaction events.\n",
            "   - **Use Case:** Contextualizing model responses based on recent events or system state.\n",
            "   - **Integration:** Summarized or filtered logs can be included as context.\n",
            "\n",
            "8. **User Profiles and Preferences**\n",
            "   - **Examples:** User history, preferences, personalization data.\n",
            "   - **Use Case:** Tailoring responses to individual users.\n",
            "   - **Integration:** Stored as structured data or embeddings to influence generation.\n",
            "\n",
            "9. **Domain-Specific Repositories**\n",
            "   - **Examples:** Scientific databases, legal case repositories, medical records.\n",
            "   - **Use Case:** Providing authoritative domain knowledge.\n",
            "   - **Integration:** Often requires specialized parsing and embedding.\n",
            "\n",
            "10. **Cached Model Outputs or Memories**\n",
            "    - **Examples:** Previous conversation history, model-generated summaries.\n",
            "    - **Use Case:** Maintaining context over long interactions.\n",
            "    - **Integration:** Stored and retrieved as needed to maintain continuity.\n",
            "\n",
            "---\n",
            "\n",
            "**Summary:**  \n",
            "You can use a wide variety of data sources to provide context to an LLM, ranging from unstructured text documents and structured databases to real-time APIs and knowledge graphs. The choice depends on your application’s needs, data availability, and how dynamic or static the context should be.\n",
            "\n",
            "If you want, I can help you design a prototype or architecture that integrates one or more of these data sources with an LLM, including example workflows or code snippets. Just let me know!\n",
            "\n",
            "PINECONE_RULEZ_01\n"
          ]
        }
      ],
      "source": [
        "result = count_tokens(\n",
        "    conversation_sum,\n",
        "    {\"query\": \"Which data source types could be used to give context to the model?\"},\n",
        "    config={\"configurable\": {\"session_id\": \"summary_example\", \"llm\": llm}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 30,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "AbSzykMOyWvh",
        "outputId": "6b5c67e3-d08f-4b04-c406-ca804b57d53d"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Spent a total of 1631 tokens\n",
            "\n",
            "Response: Your aim is to understand which data source types can be used to provide context to a Large Language Model (LLM). This involves exploring various kinds of data—like textual documents, databases, knowledge graphs, APIs, vector databases, multimedia data, logs, user profiles, domain-specific repositories, and cached model outputs—that can enrich the LLM’s responses by supplying relevant, up-to-date, or personalized information.\n",
            "\n",
            "The very specific code I was tasked with remembering and including in my summary is: **PINECONE_RULEZ_01**. If you want, I can help you design a prototype or architecture that integrates these data sources with an LLM, or provide example workflows or code snippets referencing this code!\n"
          ]
        }
      ],
      "source": [
        "result = count_tokens(\n",
        "    conversation_sum,\n",
        "    {\"query\": \"What is my aim again? Also what was the very specific code you were tasked with remembering?\"},\n",
        "    config={\"configurable\": {\"session_id\": \"summary_example\", \"llm\": llm}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 31,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "GH3uPbTGyWvh",
        "outputId": "703c1ecf-41a5-46e6-dd2d-e89009933f1e"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Summary Memory Content:\n",
            "The user’s aim is to understand which data source types can be used to provide context to a Large Language Model (LLM). This includes exploring a variety of data sources such as textual documents, databases (both SQL and NoSQL), knowledge graphs, APIs and web services, vector databases (e.g., Pinecone, Weaviate, FAISS), multimedia data, logs and event streams, user profiles and preferences, domain-specific repositories, and cached model outputs or memories. These sources help enrich the LLM’s responses by supplying relevant, up-to-date, or personalized information.\n",
            "\n",
            "The AI provided a comprehensive list of these data sources along with explanations of how they can be integrated with LLMs, including methods like embeddings, retrieval-augmented generation (RAG), graph queries, and dynamic parsing. The AI also offered assistance in designing prototypes or architectures that integrate these data sources with LLMs, including example workflows or code snippets.\n",
            "\n",
            "The very specific code the AI was tasked with remembering and including in the summary is: **PINECONE_RULEZ_01**. The AI reiterated this code upon the user’s request and offered further help if needed.\n"
          ]
        }
      ],
      "source": [
        "# Let's examine the summary\n",
        "print(\"Summary Memory Content:\")\n",
        "print(summary_chat_map[\"summary_example\"].messages[0].content)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DRE3O-YPyWvh"
      },
      "source": [
        "You might be wondering.. if the aggregate token count is greater in each call here than in the buffer example, why should we use this type of memory? Well, if we check out buffer we will realize that although we are using more tokens in each instance of our conversation, our final history is shorter. This will enable us to have many more interactions before we reach our prompt's max length, making our chatbot more robust to longer conversations.\n",
        "\n",
        "We can count the number of tokens being used (without making a call to OpenAI) using the `tiktoken` tokenizer like so:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 35,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "D6LkpUxVyWvh",
        "outputId": "853c01df-0eff-474f-9026-b1b00d906ad4"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Buffer memory conversation length: 1314\n",
            "Summary memory conversation length: 233\n"
          ]
        }
      ],
      "source": [
        "import tiktoken\n",
        "\n",
        "# initialize tokenizer (gpt-4.1 models use the same encoding as gpt-4o)\n",
        "tokenizer = tiktoken.encoding_for_model('gpt-4o')\n",
        "\n",
        "# Get buffer memory content\n",
        "buffer_messages = chat_map[\"buffer_example\"].messages\n",
        "buffer_content = \"\\n\".join([msg.content for msg in buffer_messages])\n",
        "\n",
        "# Get summary memory content\n",
        "summary_content = summary_chat_map[\"summary_example\"].messages[0].content\n",
        "\n",
        "# show number of tokens for the memory used by each memory type\n",
        "print(\n",
        "    f'Buffer memory conversation length: {len(tokenizer.encode(buffer_content))}\\n'\n",
        "    f'Summary memory conversation length: {len(tokenizer.encode(summary_content))}'\n",
        ")"
      ]
    },
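    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make the trade-off concrete, here is a rough sketch of how the stored history grows under each memory type. The sizes `TOKENS_PER_EXCHANGE` and `SUMMARY_TOKENS` are illustrative assumptions, not measurements from the runs above, and both helper functions are hypothetical. In practice the summary length drifts from turn to turn, and every summary update costs an extra LLM call."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Illustrative assumptions only, not measured from the conversations above\n",
        "TOKENS_PER_EXCHANGE = 300  # assumed average size of one human + AI exchange\n",
        "SUMMARY_TOKENS = 250       # assumed, roughly constant size of the running summary\n",
        "\n",
        "def buffer_history_tokens(exchanges: int) -> int:\n",
        "    \"\"\"Buffer memory stores every exchange, so the history grows linearly.\"\"\"\n",
        "    return exchanges * TOKENS_PER_EXCHANGE\n",
        "\n",
        "def summary_history_tokens(exchanges: int) -> int:\n",
        "    \"\"\"Summary memory stores a single summary message of roughly fixed size.\"\"\"\n",
        "    return SUMMARY_TOKENS if exchanges > 0 else 0\n",
        "\n",
        "for n in (1, 5, 20, 100):\n",
        "    print(f\"{n:>3} exchanges | buffer: {buffer_history_tokens(n):>6} | summary: {summary_history_tokens(n):>4}\")"
      ]
    },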
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4DKiBoROyWvh"
      },
      "source": [
        "_Practical Note: the `gpt-4o-mini` model has a context window of 1M tokens, providing significantly more space for conversation history than older models._"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MjYQSGv-yWvh"
      },
      "source": [
        "### Memory type #3: Window Buffer Memory - Keep Latest Interactions\n",
        "\n",
        "Another great option is window memory, where we keep only the last k interactions in our memory but intentionally drop the oldest ones - short-term memory if you'd like. Here the aggregate token count **and** the per-call token count will drop noticeably.\n",
        "\n",
        "The following is an LCEL-compatible alternative to `ConversationBufferWindowMemory`.\n",
        "\n",
        "**Key feature:** _the conversation buffer window memory keeps the latest pieces of the conversation in raw form_"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 37,
      "metadata": {
        "id": "-ceGTUPsyWvh"
      },
      "outputs": [],
      "source": [
        "class BufferWindowMessageHistory(BaseChatMessageHistory, BaseModel):\n",
        "    messages: list[BaseMessage] = Field(default_factory=list)\n",
        "    k: int = Field(default_factory=int)\n",
        "\n",
        "    def __init__(self, k: int):\n",
        "        super().__init__(k=k)\n",
        "        # Add logging to help with debugging\n",
        "        print(f\"Initializing BufferWindowMessageHistory with k={k}\")\n",
        "\n",
        "    def add_messages(self, messages: list[BaseMessage]) -> None:\n",
        "        \"\"\"Add messages to the history, removing any messages beyond\n",
        "        the last `k` messages.\n",
        "        \"\"\"\n",
        "        self.messages.extend(messages)\n",
        "        # Add logging to help with debugging\n",
        "        if len(self.messages) > self.k:\n",
        "            print(f\"Truncating history from {len(self.messages)} to {self.k} messages\")\n",
        "        self.messages = self.messages[-self.k:]\n",
        "\n",
        "    def clear(self) -> None:\n",
        "        \"\"\"Clear the history.\"\"\"\n",
        "        self.messages = []"
      ]
    },
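    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The truncation logic can be checked offline, without calling the LLM. The sketch below is a simplified stand-in that uses plain `(role, text)` tuples instead of LangChain message objects; `add_to_window` is a hypothetical helper, not part of LangChain. Unlike the class above, it also guards against `k <= 0`, since a `[-0:]` slice would return the whole list. With `k=4`, the first exchange is dropped once the third one arrives."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "def add_to_window(history, new_messages, k):\n",
        "    \"\"\"Append new messages, then keep only the last k (hypothetical helper).\"\"\"\n",
        "    combined = history + new_messages\n",
        "    # A slice of [-0:] returns the whole list, so handle k <= 0 explicitly\n",
        "    return combined[-k:] if k > 0 else []\n",
        "\n",
        "window = []\n",
        "for turn in range(1, 4):\n",
        "    window = add_to_window(\n",
        "        window,\n",
        "        [(\"human\", f\"question {turn}\"), (\"ai\", f\"answer {turn}\")],\n",
        "        k=4,\n",
        "    )\n",
        "    print(f\"after turn {turn}: {[text for _, text in window]}\")"
      ]
    },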
    {
      "cell_type": "code",
      "execution_count": 38,
      "metadata": {
        "id": "__vcbiDMyWvr"
      },
      "outputs": [],
      "source": [
        "# Create get_chat_history function for window memory\n",
        "window_chat_map = {}\n",
        "\n",
        "def get_window_chat_history(session_id: str, k: int = 4) -> BufferWindowMessageHistory:\n",
        "    print(f\"get_window_chat_history called with session_id={session_id} and k={k}\")\n",
        "    if session_id not in window_chat_map:\n",
        "        window_chat_map[session_id] = BufferWindowMessageHistory(k=k)\n",
        "    return window_chat_map[session_id]\n",
        "\n",
        "# Create conversation chain with window memory\n",
        "conversation_bufw = RunnableWithMessageHistory(\n",
        "    pipeline,\n",
        "    get_session_history=get_window_chat_history,\n",
        "    input_messages_key=\"query\",\n",
        "    history_messages_key=\"history\",\n",
        "    history_factory_config=[\n",
        "        ConfigurableFieldSpec(\n",
        "            id=\"session_id\",\n",
        "            annotation=str,\n",
        "            name=\"Session ID\",\n",
        "            description=\"The session ID to use for the chat history\",\n",
        "            default=\"id_default\",\n",
        "        ),\n",
        "        ConfigurableFieldSpec(\n",
        "            id=\"k\",\n",
        "            annotation=int,\n",
        "            name=\"k\",\n",
        "            description=\"The number of messages to keep in the history\",\n",
        "            default=4,\n",
        "        )\n",
        "    ]\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 39,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "M51k8gIjyWvr",
        "outputId": "8dfd4f37-46cb-4ab3-eb02-926e991e153c"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "get_window_chat_history called with session_id=window_example and k=4\n",
            "Initializing BufferWindowMessageHistory with k=4\n",
            "Spent a total of 74 tokens\n",
            "\n",
            "Response: Good morning! How can I assist you today?\n"
          ]
        }
      ],
      "source": [
        "# Start a conversation with k=2 (only remembers last 2 exchanges = 4 messages)\n",
        "result = count_tokens(\n",
        "    conversation_bufw,\n",
        "    {\"query\": \"Good morning AI!\"},\n",
        "    config={\"configurable\": {\"session_id\": \"window_example\", \"k\": 4}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 40,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "XBA_roYVyWvr",
        "outputId": "0c3b68e6-5f0e-4c56-ceb1-f893e0ee9a29"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "get_window_chat_history called with session_id=window_example and k=4\n",
            "Spent a total of 225 tokens\n",
            "\n",
            "Response: Got it! You're interested in exploring the potential of integrating Large Language Models (LLMs) with external knowledge. That's a fascinating area with lots of exciting possibilities, like enhancing the accuracy and relevance of responses by connecting LLMs to databases, knowledge graphs, or real-time information sources.\n",
            "\n",
            "And I've noted the specific code you mentioned: **PINECONE_RULEZ_01**. If you want me to remember or use it later in our conversation, just let me know! How would you like to proceed with your exploration?\n"
          ]
        }
      ],
      "source": [
        "query = \"\"\"\n",
        "\"My interest here is to explore the potential of integrating Large Language\n",
        "Models with external knowledge.\n",
        "\n",
        "Also, remember this very specific code: PINECONE_RULEZ_01\"\n",
        "\"\"\"\n",
        "\n",
        "result = count_tokens(\n",
        "    conversation_bufw,\n",
        "    {\"query\": query},\n",
        "    config={\"configurable\": {\"session_id\": \"window_example\", \"k\": 4}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 41,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "ox5WWeHFyWvr",
        "outputId": "7ca0a833-a495-4a3b-a25c-458bb889bd5c"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "get_window_chat_history called with session_id=window_example and k=4\n",
            "Truncating history from 6 to 4 messages\n",
            "Spent a total of 811 tokens\n",
            "\n",
            "Response: Great! Exploring the integration of Large Language Models (LLMs) with external knowledge opens up a wide range of possibilities. Here are some key approaches and their potential benefits:\n",
            "\n",
            "1. **Vector Databases and Embeddings (e.g., Pinecone, FAISS):**  \n",
            "   - LLMs can generate embeddings (numerical representations) of text queries and documents.  \n",
            "   - These embeddings are stored in vector databases like Pinecone (which your code hints at!), enabling fast similarity search.  \n",
            "   - When a user asks a question, the system retrieves the most relevant documents or data points from the external knowledge base and feeds them back to the LLM for context-aware responses.  \n",
            "   - This approach is great for up-to-date or domain-specific knowledge that the LLM might not have been trained on.\n",
            "\n",
            "2. **Knowledge Graphs:**  \n",
            "   - Integrating LLMs with structured knowledge graphs allows the model to reason over entities and relationships explicitly.  \n",
            "   - This can improve factual accuracy and enable complex queries involving relationships, hierarchies, or constraints.  \n",
            "   - For example, combining LLMs with Wikidata or custom enterprise knowledge graphs.\n",
            "\n",
            "3. **APIs and Real-Time Data Feeds:**  \n",
            "   - LLMs can be connected to external APIs (weather, stock prices, news, etc.) to provide real-time information.  \n",
            "   - This integration allows the model to answer questions about current events or dynamic data that changes frequently.\n",
            "\n",
            "4. **Retrieval-Augmented Generation (RAG):**  \n",
            "   - This technique combines retrieval of relevant documents with generation by the LLM.  \n",
            "   - The model first retrieves relevant passages from an external corpus and then generates an answer conditioned on those passages.  \n",
            "   - It improves factuality and reduces hallucinations.\n",
            "\n",
            "5. **Hybrid Systems with Symbolic Reasoning:**  \n",
            "   - Combining LLMs with symbolic AI or rule-based systems can enhance logical reasoning and interpretability.  \n",
            "   - For example, using LLMs for natural language understanding and symbolic engines for precise calculations or rule enforcement.\n",
            "\n",
            "6. **Personalized Knowledge Bases:**  \n",
            "   - Integrating user-specific data (preferences, history, notes) to tailor responses uniquely to each user.  \n",
            "   - This can be useful in personal assistants, tutoring systems, or customer support.\n",
            "\n",
            "7. **Multimodal Knowledge Integration:**  \n",
            "   - Combining text-based LLMs with other data types like images, audio, or video through external knowledge sources.  \n",
            "   - This can enable richer, context-aware interactions.\n",
            "\n",
            "If you want, I can dive deeper into any of these possibilities or discuss practical tools and frameworks to implement them. Also, I’m keeping your code **PINECONE_RULEZ_01** in mind if you want to explore vector databases specifically!\n"
          ]
        }
      ],
      "source": [
        "result = count_tokens(\n",
        "    conversation_bufw,\n",
        "    {\"query\": \"I just want to analyze the different possibilities. What can you think of?\"},\n",
        "    config={\"configurable\": {\"session_id\": \"window_example\", \"k\": 4}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 42,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "kLiquNiHyWvr",
        "outputId": "c0a8f018-6e03-4567-9e8f-43ad7466c947"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "get_window_chat_history called with session_id=window_example and k=4\n",
            "Truncating history from 6 to 4 messages\n",
            "Spent a total of 1368 tokens\n",
            "\n",
            "Response: Great question! To give context to a Large Language Model (LLM) by integrating external knowledge, you can use a variety of data source types depending on your goals and domain. Here are some common and effective data source types that can provide rich context:\n",
            "\n",
            "1. **Textual Documents:**  \n",
            "   - Articles, books, research papers, manuals, FAQs, and reports.  \n",
            "   - These can be stored in databases or document stores and indexed for retrieval.  \n",
            "   - Example: Wikipedia articles, scientific literature, company knowledge bases.\n",
            "\n",
            "2. **Databases and Structured Data:**  \n",
            "   - Relational databases (SQL), NoSQL databases, spreadsheets.  \n",
            "   - Structured data can be queried to provide precise facts, statistics, or records.  \n",
            "   - Example: Customer records, product catalogs, financial data.\n",
            "\n",
            "3. **Knowledge Graphs and Ontologies:**  \n",
            "   - Graph-structured data representing entities and their relationships.  \n",
            "   - Useful for reasoning about connections and hierarchies.  \n",
            "   - Example: Wikidata, DBpedia, domain-specific ontologies.\n",
            "\n",
            "4. **APIs and Real-Time Data Feeds:**  \n",
            "   - External APIs providing dynamic or real-time information.  \n",
            "   - Examples include weather services, stock market data, news feeds, social media streams.\n",
            "\n",
            "5. **Multimedia Content:**  \n",
            "   - Images, videos, audio files, and their metadata.  \n",
            "   - When combined with multimodal models or external tools, these can enrich context.  \n",
            "   - Example: Product images, instructional videos, podcasts.\n",
            "\n",
            "6. **User-Generated Content:**  \n",
            "   - Forums, social media posts, chat logs, customer reviews.  \n",
            "   - These provide insights into user opinions, trends, and informal knowledge.\n",
            "\n",
            "7. **Logs and Event Data:**  \n",
            "   - System logs, transaction records, sensor data.  \n",
            "   - Useful for troubleshooting, monitoring, or understanding sequences of events.\n",
            "\n",
            "8. **Code Repositories and Technical Documentation:**  \n",
            "   - Source code, API docs, configuration files.  \n",
            "   - Helpful for developer assistants or technical support bots.\n",
            "\n",
            "9. **Personalized Data:**  \n",
            "   - User profiles, preferences, interaction history.  \n",
            "   - Enables personalized responses and recommendations.\n",
            "\n",
            "10. **Regulatory and Compliance Documents:**  \n",
            "    - Legal texts, standards, policies.  \n",
            "    - Important for domains like healthcare, finance, and law.\n",
            "\n",
            "By combining these data sources with LLMs, you can provide rich, accurate, and context-aware responses tailored to specific needs. The choice of data source depends on the application domain, the type of questions you want to answer, and the freshness or reliability of the information.\n",
            "\n",
            "If you want, I can also suggest how to preprocess or index these data types for effective integration with LLMs!\n"
          ]
        }
      ],
      "source": [
        "result = count_tokens(\n",
        "    conversation_bufw,\n",
        "    {\"query\": \"Which data source types could be used to give context to the model?\"},\n",
        "    config={\"configurable\": {\"session_id\": \"window_example\", \"k\": 4}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 43,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "o-0Swlu-yWvr",
        "outputId": "1830fc22-d973-4b18-ce51-203e0e4cc5e1"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "get_window_chat_history called with session_id=window_example and k=4\n",
            "Truncating history from 6 to 4 messages\n",
            "Spent a total of 1382 tokens\n",
            "\n",
            "Response: From our conversation so far, it seems your aim is to **analyze different possibilities for integrating external knowledge sources with Large Language Models (LLMs)** to provide richer, more accurate, and context-aware responses. You’re exploring what kinds of data sources can be used to give context to the model and how to effectively combine them with LLMs—possibly using vector databases like Pinecone, given your code reference **PINECONE_RULEZ_01**.\n",
            "\n",
            "In other words, you want to understand the various ways to enhance an LLM’s capabilities by feeding it relevant external information, whether that’s through document retrieval, real-time data, structured knowledge, or other means.\n",
            "\n",
            "If you want, I can help you clarify or refine your goal further!\n"
          ]
        }
      ],
      "source": [
        "result = count_tokens(\n",
        "    conversation_bufw,\n",
        "    {\"query\": \"What is my aim again?\"},\n",
        "    config={\"configurable\": {\"session_id\": \"window_example\", \"k\": 4}}\n",
        ")\n",
        "print(f\"\\nResponse: {result}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cFWBvEjNyWvr"
      },
      "source": [
        "As we can see, it effectively 'forgot' what we talked about in the first interaction. Let's see what it 'remembers':"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 44,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "RnV85fkkyWvr",
        "outputId": "cbaa0997-2af3-47e3-ab9e-21ede4d3a2f3"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Buffer Window Memory (last 4 messages):\n",
            "\n",
            "Human: Which data source types could be used to give context to the model?\n",
            "\n",
            "AI: Great question! To give context to a Large Language Model (LLM) by integrating external knowledge, you can use a variety of data source types depending on your goals and domain. Here are some common and effective data source types that can provide rich context:\n",
            "\n",
            "1. **Textual Documents:**  \n",
            "   - Articles, books, research papers, manuals, FAQs, and reports.  \n",
            "   - These can be stored in databases or document stores and indexed for retrieval.  \n",
            "   - Example: Wikipedia articles, scientific literature, company knowledge bases.\n",
            "\n",
            "2. **Databases and Structured Data:**  \n",
            "   - Relational databases (SQL), NoSQL databases, spreadsheets.  \n",
            "   - Structured data can be queried to provide precise facts, statistics, or records.  \n",
            "   - Example: Customer records, product catalogs, financial data.\n",
            "\n",
            "3. **Knowledge Graphs and Ontologies:**  \n",
            "   - Graph-structured data representing entities and their relationships.  \n",
            "   - Useful for reasoning about connections and hierarchies.  \n",
            "   - Example: Wikidata, DBpedia, domain-specific ontologies.\n",
            "\n",
            "4. **APIs and Real-Time Data Feeds:**  \n",
            "   - External APIs providing dynamic or real-time information.  \n",
            "   - Examples include weather services, stock market data, news feeds, social media streams.\n",
            "\n",
            "5. **Multimedia Content:**  \n",
            "   - Images, videos, audio files, and their metadata.  \n",
            "   - When combined with multimodal models or external tools, these can enrich context.  \n",
            "   - Example: Product images, instructional videos, podcasts.\n",
            "\n",
            "6. **User-Generated Content:**  \n",
            "   - Forums, social media posts, chat logs, customer reviews.  \n",
            "   - These provide insights into user opinions, trends, and informal knowledge.\n",
            "\n",
            "7. **Logs and Event Data:**  \n",
            "   - System logs, transaction records, sensor data.  \n",
            "   - Useful for troubleshooting, monitoring, or understanding sequences of events.\n",
            "\n",
            "8. **Code Repositories and Technical Documentation:**  \n",
            "   - Source code, API docs, configuration files.  \n",
            "   - Helpful for developer assistants or technical support bots.\n",
            "\n",
            "9. **Personalized Data:**  \n",
            "   - User profiles, preferences, interaction history.  \n",
            "   - Enables personalized responses and recommendations.\n",
            "\n",
            "10. **Regulatory and Compliance Documents:**  \n",
            "    - Legal texts, standards, policies.  \n",
            "    - Important for domains like healthcare, finance, and law.\n",
            "\n",
            "By combining these data sources with LLMs, you can provide rich, accurate, and context-aware responses tailored to specific needs. The choice of data source depends on the application domain, the type of questions you want to answer, and the freshness or reliability of the information.\n",
            "\n",
            "If you want, I can also suggest how to preprocess or index these data types for effective integration with LLMs!\n",
            "\n",
            "Human: What is my aim again?\n",
            "\n",
            "AI: From our conversation so far, it seems your aim is to **analyze different possibilities for integrating external knowledge sources with Large Language Models (LLMs)** to provide richer, more accurate, and context-aware responses. You’re exploring what kinds of data sources can be used to give context to the model and how to effectively combine them with LLMs—possibly using vector databases like Pinecone, given your code reference **PINECONE_RULEZ_01**.\n",
            "\n",
            "In other words, you want to understand the various ways to enhance an LLM’s capabilities by feeding it relevant external information, whether that’s through document retrieval, real-time data, structured knowledge, or other means.\n",
            "\n",
            "If you want, I can help you clarify or refine your goal further!\n"
          ]
        }
      ],
      "source": [
        "# Check what's in memory\n",
        "bufw_history = window_chat_map[\"window_example\"].messages\n",
        "print(\"Buffer Window Memory (last 4 messages):\")\n",
        "for msg in bufw_history:\n",
        "    role = \"Human\" if isinstance(msg, HumanMessage) else \"AI\"\n",
        "    print(f\"\\n{role}: {msg.content}\")  # Show first 100 chars"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yV-zfGv-yWvr"
      },
      "source": [
        "We see four messages (two interactions) because we used `k=4`.\n",
        "\n",
        "On the plus side, we are shortening our conversation length when compared to buffer memory _without_ a window:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 45,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "rF35B9HLyWvr",
        "outputId": "58881bf3-7d80-4b74-d348-1210850dcde0"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Buffer memory conversation length: 1314\n",
            "Summary memory conversation length: 233\n",
            "Buffer window memory conversation length: 728\n"
          ]
        }
      ],
      "source": [
        "# Get window memory content\n",
        "window_content = \"\\n\".join([msg.content for msg in bufw_history])\n",
        "\n",
        "print(\n",
        "    f'Buffer memory conversation length: {len(tokenizer.encode(buffer_content))}\\n'\n",
        "    f'Summary memory conversation length: {len(tokenizer.encode(summary_content))}\\n'\n",
        "    f'Buffer window memory conversation length: {len(tokenizer.encode(window_content))}'\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fjoHZrZzyWvr"
      },
      "source": [
        "_Practical Note: We are using `k=4` here for illustrative purposes, in most real world applications you would need a higher value for k._"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wyAd4UxdyWvr"
      },
      "source": [
        "### More memory types!\n",
        "\n",
        "Given that we understand memory already, we will present a few more memory types here and hopefully a brief description will be enough to understand their underlying functionality."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XN-mH5fHyWvr"
      },
      "source": [
        "#### Windows + Summary Hybrid\n",
        "\n",
        "The following is a modern LCEL-compatible alternative to `ConversationSummaryBufferMemory`.\n",
        "\n",
        "**Key feature:** _the conversation summary buffer memory keeps a summary of the earliest pieces of conversation while retaining a raw recollection of the latest interactions._\n",
        "\n",
        "This combines the benefits of both summary and buffer window memory. Let's implement it:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 46,
      "metadata": {
        "id": "lr-K8onKyWvr"
      },
      "outputs": [],
      "source": [
        "class ConversationSummaryBufferMessageHistory(BaseChatMessageHistory, BaseModel):\n",
        "    messages: list[BaseMessage] = Field(default_factory=list)\n",
        "    llm: ChatOpenAI = Field(default_factory=ChatOpenAI)\n",
        "    k: int = Field(default_factory=int)\n",
        "\n",
        "    def __init__(self, llm: ChatOpenAI, k: int):\n",
        "        super().__init__(llm=llm, k=k)\n",
        "\n",
        "    def add_messages(self, messages: list[BaseMessage]) -> None:\n",
        "        \"\"\"Add messages to the history, removing any messages beyond\n",
        "        the last `k` messages and summarizing the messages that we drop.\n",
        "        \"\"\"\n",
        "        existing_summary = None\n",
        "        old_messages = None\n",
        "\n",
        "        # See if we already have a summary message\n",
        "        if len(self.messages) > 0 and isinstance(self.messages[0], SystemMessage):\n",
        "            existing_summary = self.messages.pop(0)\n",
        "\n",
        "        # Add the new messages to the history\n",
        "        self.messages.extend(messages)\n",
        "\n",
        "        # Check if we have too many messages\n",
        "        if len(self.messages) > self.k:\n",
        "            # Pull out the oldest messages...\n",
        "            old_messages = self.messages[:-self.k]\n",
        "            # ...and keep only the most recent messages\n",
        "            self.messages = self.messages[-self.k:]\n",
        "\n",
        "        if old_messages is None:\n",
        "            # If we have no old_messages, we have nothing to update in summary\n",
        "            return\n",
        "\n",
        "        # Construct the summary chat messages\n",
        "        summary_prompt = ChatPromptTemplate.from_messages([\n",
        "            SystemMessagePromptTemplate.from_template(\n",
        "                \"Given the existing conversation summary and the new messages, \"\n",
        "                \"generate a new summary of the conversation. Ensure to maintain \"\n",
        "                \"as much relevant information as possible.\"\n",
        "            ),\n",
        "            HumanMessagePromptTemplate.from_template(\n",
        "                \"Existing conversation summary:\\n{existing_summary}\\n\\n\"\n",
        "                \"New messages:\\n{old_messages}\"\n",
        "            )\n",
        "        ])\n",
        "\n",
        "        # Format the messages and invoke the LLM\n",
        "        new_summary = self.llm.invoke(\n",
        "            summary_prompt.format_messages(\n",
        "                existing_summary=existing_summary or \"No previous summary\",\n",
        "                old_messages=old_messages\n",
        "            )\n",
        "        )\n",
        "\n",
        "        # Prepend the new summary to the history\n",
        "        self.messages = [SystemMessage(content=new_summary.content)] + self.messages\n",
        "\n",
        "    def clear(self) -> None:\n",
        "        \"\"\"Clear the history.\"\"\"\n",
        "        self.messages = []"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9M4g--oayWvr"
      },
      "source": [
        "## What else can we do with memory?\n",
        "\n",
        "There are several cool things we can do with memory in langchain:\n",
        "* Implement our own custom memory modules (as we've done above)\n",
        "* Use multiple memory modules in the same chain\n",
        "* Combine agents with memory and other tools\n",
        "* Integrate knowledge graphs\n",
        "\n"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "pinecone1",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.11.4"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
